Learning Mixtures of Tree-Unions by Minimizing Description Length

Authors

  • Andrea Torsello
  • Edwin R. Hancock
Abstract

This paper addresses the unsupervised learning of tree structures in an information-theoretic setting. The approach is purely structural and is designed for representations in which the correspondences between nodes are not given but must be inferred from the structure. This contrasts with other structural learning algorithms, which assume that the node correspondences are known. The learning process fits a mixture of structural models to a set of samples using a minimum description length formulation. The method extracts both a structural archetype that describes the observed structural variation and the node correspondences that map nodes from trees in the sample set to nodes in the structural model. We use the algorithm to classify a set of shapes based on their shock graphs.
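
To make the minimum description length idea concrete, the sketch below scores a grouping of samples with a generic two-part code: a cost for the model parameters plus a cost for the data given the model. It is not the authors' formulation. It assumes the node correspondences are already known (the paper infers them), represents each sample tree only as the set of union-node labels it occupies, and uses a Bernoulli presence probability per node with a BIC-style parameter cost. The function names `description_length` and `mixture_description_length` are hypothetical.

```python
import math


def description_length(samples, eps=1e-9):
    """Two-part code length (in nats) of node sets under one union model.

    Assumption: each sample is a non-empty-list member given as a set of
    node labels already mapped onto a shared union (correspondences known).
    """
    n = len(samples)
    union = set().union(*samples)
    # Model cost: half a log(n) per node-presence parameter (BIC-style).
    model_cost = 0.5 * len(union) * math.log(n)
    data_cost = 0.0
    for node in union:
        count = sum(1 for s in samples if node in s)
        # Clip the estimated probability to keep the logarithms finite.
        p = min(max(count / n, eps), 1.0 - eps)
        data_cost -= count * math.log(p) + (n - count) * math.log(1.0 - p)
    return model_cost + data_cost


def mixture_description_length(clusters):
    """Total cost of a hard assignment of samples to mixture components."""
    n = sum(len(c) for c in clusters)
    # Cost of stating which component each sample belongs to.
    assignment_cost = n * math.log(len(clusters))
    return assignment_cost + sum(description_length(c) for c in clusters)


# Two structurally distinct groups of samples (illustrative data only).
group_a = [{"a", "b"}] * 4
group_b = [{"c", "d"}] * 4
one_model = mixture_description_length([group_a + group_b])
two_models = mixture_description_length([group_a, group_b])
print(two_models < one_model)
```

On this toy data, splitting the samples into two components yields the shorter description, which is the kind of trade-off between model complexity and fit that a description-length criterion is meant to capture.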

Similar resources

Learning Mixtures of Weighted Tree-Unions by Minimizing Description Length

This paper focuses on how to perform the unsupervised clustering of tree structures in an information theoretic setting. We pose the problem of clustering as that of locating a series of archetypes that can be used to represent the variations in tree structure present in the training sample. The archetypes are tree-unions that are formed by merging sets of sample trees, and are attributed with ...

Comparison of Artificial Neural Network, Decision Tree and Bayesian Network Models in Regional Flood Frequency Analysis using L-moments and Maximum Likelihood Methods in Karkheh and Karun Watersheds

Proper flood discharge forecasting is significant for the design of hydraulic structures, reducing the risk of failure, and minimizing downstream environmental damage. The objective of this study was to investigate the application of machine learning methods in Regional Flood Frequency Analysis (RFFA). To achieve this goal, 18 physiographic, climatic, lithological, and land use parameters were ...

A Mixed Integer Programming Approach to Optimal Feeder Routing for Tree-Based Distribution System: A Case Study

A genetic algorithm is proposed to optimize a tree-structured power distribution network considering optimal cable sizing. For minimizing the total cost of the network, a mixed-integer programming model is presented determining the optimal sizes of cables with minimized location-allocation cost. For designing the distribution lines in a power network, the primary factors must be considered as m...

Minimal Cost Complexity Pruning of Meta-Classifiers

Integrating multiple learned classification models (classifiers) computed over large and (physically) distributed data sets has been demonstrated as an effective approach to scaling inductive learning techniques, while also boosting the accuracy of individual classifiers. These gains, however, come at the expense of an increased demand for run-time system resources. The final ensemble meta-clas...

Transfer Learning Using the Minimum Description Length Principle with a Decision Tree Application

Transfer learning is about how learning from one domain or a collection of domains can be applied to another. It is learning from similarities and parallels, from experience. This paper is about a distribution free, data driven, extendable framework for transfer learning, based on the minimum description length principle. We define transfer learning in terms of a specific framework, where we ha...

Publication date: 2003